A K-fold Averaging Cross-validation Procedure.
Abstract
Cross-validation-type methods have been widely used to facilitate model estimation and variable selection. In this work, we suggest a new K-fold cross-validation procedure that selects a candidate 'optimal' model from each hold-out fold and averages the K candidate 'optimal' models to obtain the ultimate model. Due to the averaging effect, the variance of the proposed estimates can be significantly reduced. This new procedure yields more stable and efficient parameter estimation than the classical K-fold cross-validation procedure. In addition, we show the asymptotic equivalence between the proposed and classical cross-validation procedures in the linear regression setting. We also demonstrate the broad applicability of the proposed procedure via two examples: parameter sparsity regularization and quantile smoothing splines modeling. We illustrate the promise of the proposed method through simulations and a real data example.
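The averaging procedure described in the abstract can be sketched in a few lines. This is a minimal illustrative implementation, not the authors' actual method: it assumes a linear regression setting with a ridge penalty as the family of candidate models, and the penalty grid `lambdas` is a hypothetical tuning-parameter set. For each fold, a candidate model is fitted on the other K-1 folds, the 'optimal' candidate is the one with the smallest held-out error, and the K candidate coefficient vectors are averaged.

```python
import numpy as np

def kfold_averaging_cv(X, y, lambdas, k=5, seed=0):
    """Sketch of K-fold averaging cross-validation (illustrative only).

    For each of the k folds: fit one ridge candidate per lambda on the
    remaining k-1 folds, keep the candidate with the smallest error on
    the held-out fold, then average the k fold-wise 'optimal' models.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    candidates = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        best_beta, best_err = None, np.inf
        for lam in lambdas:
            # closed-form ridge solution on the k-1 training folds
            A = X[train].T @ X[train] + lam * np.eye(p)
            beta = np.linalg.solve(A, X[train].T @ y[train])
            err = np.mean((y[test] - X[test] @ beta) ** 2)
            if err < best_err:
                best_beta, best_err = beta, err
        candidates.append(best_beta)
    # averaging the K candidate models reduces the variance of the estimate
    return np.mean(candidates, axis=0)
```

The final model is the average of the K fold-wise winners rather than a single refit at one selected tuning value, which is the source of the variance reduction the abstract refers to.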
Similar Resources
Long-term Streamflow Forecasting by Adaptive Neuro-Fuzzy Inference System Using K-fold Cross-validation: (Case Study: Taleghan Basin, Iran)
Streamflow forecasting plays an important role in water resource management (e.g. flood control, drought management, reservoir design). In this paper, the Adaptive Neuro-Fuzzy Inference System (ANFIS) is applied to long-term streamflow forecasting (monthly, seasonal), and the K-fold cross-validation method is investigated to evaluate the test-training data split in the model. Then, ...
Cross-Validation
Definition Cross-Validation is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model. In typical cross-validation, the training and validation sets must cross-over in successive rounds such that each data point has a chance of being validated against. The basic form of ...
Cross-Validation and Mean-Square Stability
K-fold cross-validation is a popular practical method to get a good estimate of the error rate of a learning algorithm. Here, the set of examples is first partitioned into k equal-sized folds. Each fold acts as a test set for evaluating the hypothesis learned on the other k − 1 folds. The average error across the k hypotheses is used as an estimate of the error rate. Although widely used, espec...
Complete Cross-Validation for Nearest Neighbor Classifiers
Cross-validation is an established technique for estimating the accuracy of a classifier and is normally performed either using a number of random test/train partitions of the data, or using k-fold cross-validation. We present a technique for calculating the complete cross-validation for nearest-neighbor classifiers: i.e., averaging over all desired test/train partitions of data. This technique ...
Model Averaging With Holdout Estimation of the Posterior Distribution
The holdout estimation of the expected loss of a model is biased and noisy. Yet, practitioners often rely on it to select the model to be used for further predictions. Repeating the learning phase with small variations of the training set reveals a variation in the selected model, which in turn induces an important variation in the final test performance. Thus, we propose a small modification to th...
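For contrast with the averaging procedure of the main paper, the classical K-fold error estimate described in the snippets above can be sketched as follows. This is a generic sketch, not code from any of the cited papers; `fit` and `predict` are hypothetical callables standing in for an arbitrary learning algorithm.

```python
import numpy as np

def kfold_error(X, y, fit, predict, k=5, seed=0):
    """Classical K-fold cross-validation error estimate.

    Each fold serves once as the test set for a hypothesis learned on
    the other k-1 folds; the k test errors are averaged.
    """
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        # mean squared error on the held-out fold
        errs.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return float(np.mean(errs))
```

Here the output is a single scalar error estimate used to pick one model, whereas the averaging procedure above instead combines the K fold-wise models themselves.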
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
Journal: Journal of Nonparametric Statistics
Volume 27, Issue 2
Pages: -
Publication date: 2015